Constructing a Personal Web Map with Anytime-Control of Web Robots
Authors
Abstract
In this paper, we propose the PWM (Personal Web Map), a small personal database of Web pages that interest a user, and develop a method for constructing it under the user's control of multiple Web robots. Although general search engines with large databases, such as Yahoo, AltaVista, and MetaCrawler, are useful, it is also important that a user can construct a small personal database of Web pages relevant to his/her interests, much like Bookmarks. The PWM is such a database, and the user can control its construction. First, the user gives the system keywords indicating his/her interests, and the system constructs a PWM related to those keywords. To build a useful PWM, the user must be able to interrupt construction at any time and indicate a sub-field in which the PWM should be expanded further. To support this, we develop an anytime-control algorithm for multiple Web robots. It uses a density distribution blackboard to build a uniformly distributed PWM. Whenever the user interrupts the system, it provides a valid PWM in the sense that the search space is kept wide, and it presents many alternatives from which the user can choose where he/she wants more information. Document vectors are generated from the Web pages in the database and used to construct a 2D map of the PWM with self-organizing maps. The user can easily grasp the PWM through the 2D map and gives instructions by clicking a node about which he/she wants more detailed information. We conducted experiments with users and found that our method outperformed breadth-first search in constructing a useful PWM. These results suggest that a PWM system is a promising approach to assisting a user in gathering relevant information on the WWW.
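The abstract does not give the details of the anytime-control algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a shared blackboard counts the pages collected per sub-field, each step sends a robot to the sparsest sub-field so that coverage stays roughly uniform, and the loop can be stopped after any step with a usable partial result. The names `DensityBlackboard`, `build_pwm`, `fetch`, and `focus` are hypothetical and are not taken from the paper; the multiple robots are simulated by a single sequential loop.

```python
from typing import Callable, Dict, Iterable, List, Optional


class DensityBlackboard:
    """Hypothetical shared record of how many pages each sub-field holds."""

    def __init__(self, subfields: Iterable[str]) -> None:
        self.counts: Dict[str, int] = {s: 0 for s in subfields}
        # Sub-field the user asked to expand (None means no instruction).
        self.focus: Optional[str] = None

    def next_subfield(self) -> str:
        # A user instruction takes priority; otherwise choose the sparsest
        # sub-field (ties broken by name) to keep coverage roughly uniform.
        if self.focus is not None:
            return self.focus
        return min(self.counts, key=lambda s: (self.counts[s], s))

    def record(self, subfield: str) -> None:
        self.counts[subfield] += 1


def build_pwm(
    blackboard: DensityBlackboard,
    fetch: Callable[[str], str],
    budget: int,
    interrupted: Callable[[], bool] = lambda: False,
) -> Dict[str, List[str]]:
    """Collect up to `budget` pages; stop early when `interrupted()` is true.

    `fetch` stands in for one robot retrieving one page for a sub-field
    (an assumption made for illustration). The partial map returned on
    interruption is always usable, which is the "anytime" property.
    """
    pwm: Dict[str, List[str]] = {s: [] for s in blackboard.counts}
    for _ in range(budget):
        if interrupted():
            break
        target = blackboard.next_subfield()
        pwm[target].append(fetch(target))
        blackboard.record(target)
    return pwm


if __name__ == "__main__":
    board = DensityBlackboard(["robots", "search", "maps"])
    first = build_pwm(board, lambda s: f"page about {s}", budget=6)
    print({k: len(v) for k, v in first.items()})

    # The user clicks a node: expand only the "maps" sub-field from now on.
    board.focus = "maps"
    more = build_pwm(board, lambda s: f"page about {s}", budget=3)
    print({k: len(v) for k, v in more.items()})
```

In this sketch, setting `focus` plays the role of the user clicking a node on the 2D map; a real system would derive the sub-fields from the self-organizing map rather than from fixed labels.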
Similar resources
A density based clustering approach to distinguish between web robot and human requests to a web server
Today, the world's dependence on the Internet and the emergence of Web 2.0 applications are significantly increasing the need for web robots that crawl sites to support services and technologies. Although robots have advantages, they may consume bandwidth and reduce the performance of web servers. Despite a variety of studies, there is no accurate method for classifying huge data ...
Effective Learning to Rank Persian Web Content
Persian is one of the most widely used languages on the Web. Hence, the Persian Web contains invaluable information that needs to be retrieved effectively. As with other languages, ranking algorithms for Persian Web content face various challenges, such as applicability issues in real-world situations and the lack of user modeling. CF-Rank, as a ...
Representing a method to identify and contrast with the fraud which is created by robots for developing websites’ traffic ranking
With the expansion of the Internet and the Web, communication and information gathering between individuals have shifted from their traditional forms to web sites. The World Wide Web also offers businesses a great opportunity to improve their relationships with clients and expand their marketplace in the online world. Businesses use a criterion called traffic ranking to determine their si...
Image flip CAPTCHA
The massive and automated access to Web resources through robots has made it essential for Web service providers to determine whether the "user" is a human or a robot. A Human Interaction Proof (HIP) such as the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) offers a way to make such a distinction. CAPTCHA is a reverse Turing test used by Web serv...
Transforming basic robotic platforms into easily deployable and Web remotely controllable robots
This paper describes a way to transform basic robotic platforms into robots that can be controlled remotely over the Web. Our goal is to achieve low-cost robot deployment anywhere, at any time. As soon as full or even restricted Internet access is available (Wi-Fi or 3G), the robot can be deployed and Web-controlled. The distant user can send commands to the robot and monitor its state. For example th...
Journal title:
- Int. J. Cooperative Inf. Syst.
Volume 11, Issue
Pages -
Publication date 1999